Abstract: We present a high-rate $(n, k, d = n-1)$-MSR code
with a sub-packetization level that is polynomial in the dimension
$k$ of the code. While a polynomial sub-packetization level was
achieved earlier for vector MDS codes that repair systematic
nodes optimally, no such MSR code construction is known. In
the low-rate regime (i.e., rates less than one-half), MSR code
constructions with a linear sub-packetization level are available.
But in the high-rate regime (i.e., rates greater than one-half),
the known MSR code constructions required a sub-packetization
level that is exponential in $k$. In the present paper, we construct
an MSR code for $d = n-1$ with a fixed rate $R = \frac{t-1}{t}$, $t \geq 2$,
achieving a sub-packetization level $\alpha = O(k^t)$. The code allows
help-by-transfer repair, i.e., no computations are needed at the
helper nodes during repair of a failed node.

Index Terms: Distributed storage, regenerating codes,
sub-packetization, MSR.

I. Introduction 

In a distributed storage system, a data file comprising
$B$ data symbols drawn from a finite field $\mathbb{F}_q$ is encoded
using an error-correcting code of block length $n$, and the code
symbols are stored in $n$ nodes of the storage network. A naive
strategy aimed at achieving resilience against node failures is
to store multiple replicas of the same data. Given the massive
amount of data being stored, sophisticated codes such as
Reed-Solomon (RS) codes with low storage overhead are being
employed in practice. However, the amount of data download
required to repair a single node failure is quite large for RS
codes. The framework of regenerating codes was introduced
in [1] to address this problem. In an $(n, k, d)$-regenerating
code, a file comprised of $B$ symbols from a finite field $\mathbb{F}_q$
is encoded into a set of $n\alpha$ coded symbols and then stored
across $n$ nodes in the network, with each node storing $\alpha$ coded
symbols. The parameter $\alpha$ is termed the sub-packetization
level of the code. A data collector can download the data
by connecting to any $k$ nodes. In the event of node failure,
node repair is accomplished by having the replacement node
connect to any $d$ nodes and download $\beta \leq \alpha$ symbols from
each node with $\alpha \leq d\beta \leq B$. The quantity $d\beta$ is termed
the repair bandwidth. Here one makes a distinction between
functional and exact repair. By functional repair (FR), it is
meant that a failed node will be replaced by a new node
such that the resulting network continues to satisfy the data
collection and node-repair properties defining a regenerating
code. An alternative to functional repair is exact repair (ER),
under which one demands that the replacement node store
precisely the same content as the failed node.

A cut-set bound based on network-coding concepts tells
us that, given a code parameter set $(n, k, d)$, the maximum
possible size of a data file under functional repair (FR) is
upper bounded [1] by

$$B \leq \sum_{\ell=1}^{k} \min\{\alpha, (d - \ell + 1)\beta\}. \tag{1}$$

The above bound is tight since the existence of codes achieving
this bound has been established using network-coding arguments
related to multicasting. For fixed values of $(n, k, d, B)$,
the bound in (1) characterizes a tradeoff between $\alpha$ and $\beta$,
referred to as the Storage-Repair Bandwidth tradeoff. The two
extremal points in the tradeoff are, respectively, the
minimum-storage regenerating (MSR) and minimum-bandwidth
regenerating (MBR) points, which correspond to the points at which
the storage and repair bandwidth are respectively minimized.
At the MBR point, we have

$$\alpha = d\beta, \quad B = k\alpha - \binom{k}{2}\beta, \tag{2}$$

and at the MSR point, we have

$$\alpha = (d - k + 1)\beta, \quad B = k\alpha. \tag{3}$$

It is proved that MSR and MBR points are achievable by ER
codes as well. The focus of the current paper is on ER MSR
codes and for convenience we simply refer to them as MSR
codes.
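As a quick numerical illustration of the bound in (1) and the two extremal points, the following minimal sketch evaluates the right-hand side of the cut-set bound and checks that it meets the file size at the MSR and MBR points. The function names and the example values $k = 4$, $d = 5$, $\beta = 3$ are illustrative assumptions and do not come from the paper; the MBR file size is taken in the standard form $B = k\alpha - \binom{k}{2}\beta$.

```python
from math import comb


def cutset_bound(k, d, alpha, beta):
    """Right-hand side of the cut-set bound: sum_{l=1}^{k} min(alpha, (d - l + 1) * beta)."""
    return sum(min(alpha, (d - l + 1) * beta) for l in range(1, k + 1))


def msr_point(k, d, beta):
    """Return (alpha, B) at the minimum-storage point."""
    alpha = (d - k + 1) * beta
    return alpha, k * alpha


def mbr_point(k, d, beta):
    """Return (alpha, B) at the minimum-bandwidth point."""
    alpha = d * beta
    return alpha, k * alpha - comb(k, 2) * beta


# Hypothetical example parameters, chosen only for illustration.
k, d, beta = 4, 5, 3
alpha_msr, b_msr = msr_point(k, d, beta)
alpha_mbr, b_mbr = mbr_point(k, d, beta)

# At both extremal points the file size equals the bound.
print(alpha_msr, b_msr, cutset_bound(k, d, alpha_msr, beta))
print(alpha_mbr, b_mbr, cutset_bound(k, d, alpha_mbr, beta))
```

For these values the MSR point has $\alpha = 6$, $B = 24$ and the MBR point has $\alpha = 15$, $B = 42$; in both cases the bound is met with equality.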

A. MSR Codes 

The MSR codes can be considered as codes over a vector
alphabet $\mathbb{F}_q^{\alpha}$ with dimension $k$. Since they tolerate any $(n-k)$
node erasures and have a file size of $B = k\alpha$, MSR codes
are Maximum-Distance-Separable (MDS) codes over the vector
alphabet $\mathbb{F}_q^{\alpha}$. The combination of these two properties is
therefore called the MDS property of MSR codes. On the other
hand, MSR codes, in addition to being vector MDS codes, can
repair a failed node with the least possible repair bandwidth.

The construction of MSR codes is a well-studied problem
in the literature. In [2], a framework to construct MSR codes is
provided for $d \geq 2k - 2$. In [3], high-rate MSR codes with
parameters $(n, k = n-2, d = n-1)$ are constructed using
Hadamard designs. In [4], high-rate MSR codes, known as
zigzag codes, are constructed for $d = n-1$; here efficient
node repair is guaranteed only in the case of systematic nodes.
This was subsequently extended to include the repair of parity
nodes as well in [5]. A construction for MSR codes with
$d = n-1 \geq 2k-1$ using techniques of interference alignment is
presented in [6] and [7]. In [8], the authors showed the existence
of MSR codes for any value of $(n, k, d)$.

B. Our Approach On Sub-packetization and Contributions 

A parameter of interest for MSR codes is the amount of
sub-packetization ($\alpha$) required for a given value of $(n, k, d)$.
The MSR constructions known as zigzag codes that allow
arbitrarily high rates required a sub-packetization level that
is exponential in $k$. Later in [9], a vector MDS code that
repairs systematic nodes was constructed achieving $\alpha = r^{\frac{k}{r+1}}$,
where $r := n - k$. Recently in [10], another vector MDS
code that repairs systematic nodes optimally was proposed
satisfying an additional property known as access-optimality.
The construction required $\alpha = r^{\frac{k}{r}}$. In [11], the authors derived a
lower bound on the sub-packetization in terms of $k$ and $r$ as
given below:

$$2 \log_2 \alpha \left( \log_{\frac{r}{r-1}} \alpha + 1 \right) + 1 \geq k.$$

Earlier in [12], the authors constructed a vector MDS code with
rate $R = \frac{2}{3}$, requiring an $\alpha$ that is polynomial in $k$. They could
also achieve polynomial $\alpha$ for any fixed rate in the regime
$\frac{2}{3} \leq R < 1$. However, these codes were also limited by the
fact that optimal repair was feasible for systematic nodes alone.
Quite similar to the approach in [12], we also restrict our focus
to the family of MSR codes with a fixed rate $R = \frac{t-1}{t}$, $t \geq 2$.

It is worthwhile to remark at this point that the family of
Product-Matrix MSR codes [2] with rate restricted by $R < \frac{1}{2}$
required only a linear sub-packetization level. In the present
paper, we construct an $(n, k, d = n-1)$-MSR code with a fixed
rate $R = \frac{t-1}{t}$, where $t \geq 2$ is an integer parameter. The code
will have $\alpha = \left(\frac{k}{t-1}\right)^t$. To the best of our knowledge, these are
the first MSR constructions that achieve a sub-packetization
level that is polynomial in $k$. These codes are help-by-transfer
codes, by which we mean that the helper nodes need not do
any computation during the repair of a failed node.

II. MSR Code Construction For Rate $\frac{t-1}{t}$

In this section, we provide the construction for MSR codes
with a rate $R = \frac{t-1}{t}$ for some integer $t \geq 2$. The
construction is described for the particular example of $t = 3$,
and subsequently generalized.

A. Code Construction for $R = \frac{2}{3}$

We have an auxiliary parameter $q = p^m$ for some prime $p$
and $m$ a positive integer.* Then the code has parameters

$$n = 3q, \quad k = 2q, \quad d = n - 1, \quad \alpha = q^3.$$

*The auxiliary parameter takes values from a finite field, though it is
sufficient to work with a finite ring. This does not cause any loss of generality
in the principles used for the construction.


A codeword of an MSR code can be treated as an array of
size $(\alpha \times n)$. We first introduce an indexing for the rows and
columns (nodes and columns are often used interchangeably)
of the codeword array. Let $\mathbb{F}_q = \{0, 1, \ldots, q-1\}$ denote a
finite field of size $q$, and a 2-tuple $(i, \theta)$, $i \in \{1, 2, 3\}$, $\theta \in \mathbb{F}_q$ is
used to index the columns. The rows are indexed by elements
$(x, y, z)$ from $\mathbb{F}_q^3$ where $x, y, z \in \mathbb{F}_q$. Thus $C(x, y, z; (i, \theta))$
represents one code symbol from the codeword array at the
intersection of the row $(x, y, z)$ and the node $(i, \theta)$. In order
to describe the code, we first introduce the following notation

$$a_1 \oplus a_2 \oplus \cdots \oplus a_n := \sum_{i=1}^{n} c_i a_i, \quad c_i \neq 0, \ \forall i \in [n] \tag{4}$$

to denote a linear combination involving each of the scalars
in $\{a_1, a_2, \ldots, a_n\}$ with non-zero coefficients. The notation is
oblivious to the particular choice of non-zero coefficients in
the linear combination. The code is described by parity-check
constraints. Throughout this paper, the symbol $\sum$ is
used with a different meaning. The terms within a $\sum$ are not
connected by the binary operator $+$, but by the $\oplus$ operator as
defined in (4). For every $(x, y, z) \in \mathbb{F}_q^3$,

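$$\sum_{\theta \in \mathbb{F}_q} C(x, y, z; (1, \theta)) \oplus \sum_{\theta \in \mathbb{F}_q} C(x, y, z; (2, \theta)) \oplus \sum_{\theta \in \mathbb{F}_q} C(x, y, z; (3, \theta)) = 0, \tag{5}$$

$$C(x - \Delta, y, z; (1, x)) \oplus C(x, y - \Delta, z; (2, y)) \oplus C(x, y, z - \Delta; (3, z)) \oplus \sum_{\theta \in \mathbb{F}_q} C(x, y, z; (1, \theta)) \oplus \sum_{\theta \in \mathbb{F}_q} C(x, y, z; (2, \theta)) \oplus \sum_{\theta \in \mathbb{F}_q} C(x, y, z; (3, \theta)) = 0, \quad \Delta \in \mathbb{F}_q^*. \tag{6}$$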

The parity-check constraint in (5) is referred to as the
row-parity, and that in (6) is referred to as the $\Delta$-parity. It can
be observed that the first three terms in the $\Delta$-parity equations
are entries that do not belong to the $(x, y, z)$-row. These entries
are referred to as the shifted entries. What remains is the
identification of coefficients in these parity-check constraints
so that the MDS property holds. Instead of constructing these
coefficients explicitly, we will show in Sec. II-A2 that such
coefficients indeed exist in a sufficiently large field. Therefore,
the description of the code is complete with (5), (6).

In the zigzag code [4], parity symbols are categorized into
two types, namely row-parities and zigzag parities. The row
parities are made up of message symbols from the same row
of the codeword array. But the zigzag parities are made up
of message symbols belonging to various rows such that one
message symbol is picked per column. In our construction
also, every parity-check constraint corresponding to $\Delta \neq 0$
involves shifted entries that do not belong to the row under
consideration. In this manner, our construction is of a similar
flavor as that in [4]. But the major difference of our
construction from the zigzag construction lies in the symmetry
of the parity-check constraints. It also differs in the fact that
two symbols of the same column can be involved in the same
parity-check constraint in the case of $\Delta \neq 0$. Such an approach
was earlier adopted in [10].

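1) Optimal Repair of a Failed Node: Without loss of
generality, assume that the node $(1, \theta_0)$ failed. We download
symbols belonging to the rows $\Gamma = \{(\theta_0, y, z) \mid y, z \in \mathbb{F}_q\}$.
Clearly $|\Gamma| = q^2$. Thus we have
$\{C(\theta_0, y, z; (i, \theta)) \mid (i, \theta) \neq (1, \theta_0),\ y, z \in \mathbb{F}_q\}$.
The rows are selected such that $x = \theta_0$, because the first
coordinate of the index of the node is 1. If the first coordinate
had been 2 or 3, we would have fixed $y = \theta_0$ or $z = \theta_0$
respectively. All the code symbols

$$C(\theta_0, y, z; (1, \theta_0)), \quad y, z \in \mathbb{F}_q$$

are repaired using the row-parities. Hence we have all the
symbols belonging to rows in $\Gamma$ from all the $n$ nodes. Next,
let us write the equation for the $\Delta$-parity, $\Delta \in \mathbb{F}_q^*$, corresponding
to an arbitrary row $(\theta_0, y, z) \in \Gamma$:

$$C(\theta_0 - \Delta, y, z; (1, \theta_0)) \oplus C(\theta_0, y - \Delta, z; (2, y)) \oplus C(\theta_0, y, z - \Delta; (3, z)) \oplus \sum_{\theta \in \mathbb{F}_q} C(\theta_0, y, z; (1, \theta)) \oplus \sum_{\theta \in \mathbb{F}_q} C(\theta_0, y, z; (2, \theta)) \oplus \sum_{\theta \in \mathbb{F}_q} C(\theta_0, y, z; (3, \theta)) = 0. \tag{7}$$

Except the term $C(\theta_0 - \Delta, y, z; (1, \theta_0))$, all other symbols
involved in (7) are known to us: the other two shifted entries lie
in rows of $\Gamma$. Thus $C(\theta_0 - \Delta, y, z; (1, \theta_0))$
can be repaired for all choices of $y, z$. By making use of all
the $\Delta$-parities, we can thus repair all the remaining symbols
in the node $(1, \theta_0)$. The total number of symbols downloaded
per node is

$$\beta = q^2 = \frac{\alpha}{d - k + 1},$$

and thus the repair is bandwidth-optimal.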
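The bookkeeping behind this repair argument can be checked mechanically for small $q$. The sketch below tracks only which symbols are known, ignoring the coefficients of the parity-checks, and confirms that every $\Delta$-parity of a row in $\Gamma$ contains exactly one unknown symbol and that this symbol belongs to the failed node. It is an illustration under stated assumptions: $q$ is taken to be prime so that shifts can be computed modulo $q$, and the function name `repair_bookkeeping` is hypothetical.

```python
from itertools import product


def repair_bookkeeping(q, theta0):
    """Check the repair of node (1, theta0) for the t = 3 construction.

    Assumption of this sketch: q is prime, so F_q is the integers mod q.
    Returns (number of recovered rows of the failed node, symbols sent per helper).
    """
    failed = (1, theta0)
    # Rows downloaded from every helper node: first coordinate fixed to theta0.
    gamma = {(theta0, y, z) for y, z in product(range(q), repeat=2)}
    # After the row-parity step, the failed node is known in every row of gamma.
    recovered_rows = set(gamma)
    for (x, y, z) in gamma:
        for delta in range(1, q):
            # Shifted entries of the Delta-parity, stored as (row, node).
            shifted = [
                (((x - delta) % q, y, z), (1, x)),
                ((x, (y - delta) % q, z), (2, y)),
                ((x, y, (z - delta) % q), (3, z)),
            ]
            # A symbol is unknown exactly when its row lies outside gamma.
            unknown = [(row, node) for row, node in shifted if row not in gamma]
            assert len(unknown) == 1 and unknown[0][1] == failed
            recovered_rows.add(unknown[0][0])
    beta = len(gamma)  # each helper sends its symbols from the rows of gamma
    return len(recovered_rows), beta


print(repair_bookkeeping(3, 1))
```

For $q = 3$ the check reports that all $\alpha = 27$ rows of the failed node are recovered while each helper contributes $\beta = 9$ symbols, in agreement with $\beta = \alpha / (d - k + 1)$ for $d - k + 1 = q$.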

2) The MDS Property: In this section, we will show that
we can find an assignment of coefficients to the row-parities
and $\Delta$-parities such that the code satisfies the MDS property.
We start by stating a useful fact.

Lemma 2.1: Let $H$ be a $((n-k) \times n)$ parity-check matrix
of a linear code $\mathcal{C}$. If $S \subseteq [n]$, $|S| = (n-k)$ is such that
$\mathrm{rank}(H|_S) = (n-k)$, then it is possible to decode every
codeword of $\mathcal{C}$ accessing symbols belonging to locations
$S^c = [n] \setminus S$.

Based on the parity constraints in (5), (6), we will determine
the structure of the parity-check matrix $H$. First, we vectorize
the codeword array node-by-node so that the first $\alpha = q^3$
columns of $H$ represent the first node, the second $q^3$ columns
represent the second node and so on. The group of $q^3$ columns
associated with a node is referred to as a thick column. The
parity-check matrix thus obtained will be of size $(q^4 \times 3q^4)$
with $n$ thick columns each containing $\alpha$ thin columns. In order
to describe the support and thereby the structure of $H$, we will
for a moment assume that all the coefficients are set to 1. This
matrix is denoted by $H_s$, and is given by

$$H_s = J + E,$$

where the matrices $J$ and $E$ are given in (8) and (9). The
equation (8) also illustrates the fact that the rows of $J$ can
be decomposed into blocks of size $q^3$, each corresponding
to parity-check constraints with a fixed $\Delta$. The first set of
$q^3$ parity-check constraints corresponds to the row-parities, which
may be associated with $\Delta = 0$.
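The dimensions and row weights of the support matrix can be confirmed by enumerating the parity-checks directly. The following sketch builds the support of each row of $H_s$ as a set of (row, node) symbol indices, with the all-nodes part playing the role of $J$ and the three shifted entries playing the role of $E$. It assumes $q$ is prime so that shifts reduce to arithmetic modulo $q$; the function name `support_of_hs` is hypothetical.

```python
from itertools import product


def support_of_hs(q):
    """Enumerate the supports of the parity-checks for the t = 3 construction.

    Assumption of this sketch: q is prime, so shifts are taken mod q.
    Returns (number of checks, number of code symbols, list of supports).
    """
    nodes = [(i, theta) for i in (1, 2, 3) for theta in range(q)]
    rows = list(product(range(q), repeat=3))
    checks = []
    for delta in range(q):  # delta = 0 gives the block of row-parities
        for (x, y, z) in rows:
            # Contribution of J: every node in the row (x, y, z).
            support = {((x, y, z), node) for node in nodes}
            if delta != 0:
                # Contribution of E: the three shifted entries.
                support |= {
                    (((x - delta) % q, y, z), (1, x)),
                    ((x, (y - delta) % q, z), (2, y)),
                    ((x, y, (z - delta) % q), (3, z)),
                }
            checks.append(support)
    return len(checks), len(rows) * len(nodes), checks


num_checks, num_symbols, checks = support_of_hs(2)
print(num_checks, num_symbols, sorted({len(c) for c in checks}))
```

For $q = 2$ this gives a $16 \times 48$ matrix, that is $q^4 \times 3q^4$, whose row-parity rows have weight $3q = 6$ and whose $\Delta$-parity rows have weight $3q + 3 = 9$.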

In (8), (9), $I_{q^3}$ and $0_{q^3}$ respectively represent the identity and
all-zero matrices of size $q^3 \times q^3$. The matrices
$\{E^{(i)}_{\delta,\theta} \mid i \in \{1, 2, 3\},\ \delta \in \mathbb{F}_q^*,\ \theta \in \mathbb{F}_q\}$
are made up of ones and zeros, and
represent the shifted entries of the corresponding $\Delta$-parities.
The matrices $J$, $E$ and hence $H$ are block matrices of size
$(q \times 3q)$ where each block is a square matrix of size $q^3$. We
will show that the MDS property can be ensured by assigning
suitable coefficients to locations identified by the support of
$H_s$. Our method is quite similar to the method used in [10].
By Lemma 2.1, it is sufficient that $H$ restricted to any $(n-k) = q$
thick columns has a rank equal to $q\alpha = q^4$. Let us assign an
indeterminate $c$ to all the locations determined by the support
of $E$. Now consider the square submatrix $H_D$ obtained by
restricting $H$ to $D \subseteq [n]$, $|D| = (n-k)$ thick columns. If we
assume that all the coefficients of $J$ are fixed, the determinant
of $H_D$ will be a polynomial in the indeterminate $c$. Let us
denote this polynomial by $p_D(c)$. In the following lemma, we
prove that $p_D(c)$ can be made a non-zero polynomial for every
choice of $D \subseteq [n]$, $|D| = n-k$.

Lemma 2.2: There exists an assignment of coefficients to
$J$ such that $p_D(c)$ is a non-zero polynomial for every choice
of $D \subseteq [n]$, $|D| = n-k$.

Proof: Consider a $[3q, 2q]$-RS code and its parity-check
matrix $H_{\mathrm{MDS}}$ of size $(q \times 3q)$. Clearly a $(q \times q)$-matrix obtained
by restricting $H_{\mathrm{MDS}}$ to any $q$ columns has full rank. Let $A \otimes B$
denote the Kronecker product of matrices $A$ and $B$. If we set
$J$ to $J_0$,

$$J_0 = H_{\mathrm{MDS}} \otimes I_{q^3}, \tag{10}$$

then we must have $p_D(0)$ evaluating to a non-zero value for
every choice of $D \subseteq [n]$, $|D| = n-k$. Hence $p_D(c)$ must be
a non-zero polynomial for every valid choice of $D$. ■
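The key step of the proof, that the Kronecker product of an MDS parity-check matrix with an identity matrix has full rank on every set of $n - k$ thick columns, can be checked numerically for a small case. The sketch below is an illustration under assumptions that are not taken from the paper: it uses $q = 2$, a Vandermonde matrix with evaluation points $0, \ldots, n-1$ over the prime field GF(7) as a stand-in for $H_{\mathrm{MDS}}$, and hypothetical helper names. Setting $c = 0$ corresponds to working with $J_0$ alone.

```python
from itertools import combinations


def rank_mod_p(rows, p):
    """Rank of an integer matrix over GF(p), p prime, by Gaussian elimination."""
    m = [[v % p for v in r] for r in rows]
    rank = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [(v * inv) % p for v in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                f = m[r][col]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def kron_with_identity(h, size):
    """Kronecker product of h with the identity matrix of the given size."""
    return [
        [hrow[j] if b == a else 0 for j in range(len(hrow)) for b in range(size)]
        for hrow in h
        for a in range(size)
    ]


def all_thick_submatrices_full_rank(q, p):
    """Check that J_0 restricted to any q thick columns has rank q * q^3 over GF(p)."""
    n, r, size = 3 * q, q, q ** 3
    # Vandermonde rows [a^0, a^1, ...] at the points a = 0..n-1 (distinct when p >= n).
    h_mds = [[pow(a, e, p) for a in range(n)] for e in range(r)]
    j0 = kron_with_identity(h_mds, size)
    for thick in combinations(range(n), r):
        cols = [t * size + b for t in thick for b in range(size)]
        sub = [[row[c] for c in cols] for row in j0]
        if rank_mod_p(sub, p) != r * size:
            return False
    return True


print(all_thick_submatrices_full_rank(2, 7))
```

With $p = 7$ the six evaluation points are distinct and every one of the 15 choices of two thick columns gives a full-rank $16 \times 16$ submatrix. With $p = 5$ the points 0 and 5 coincide, the stand-in matrix is no longer MDS, and the check fails, which shows that the MDS property of $H_{\mathrm{MDS}}$ is what the argument relies on.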

Henceforth, we assume that the coefficients of the polynomials
$p_D(c)$ are fixed by the coefficients of $J$ as determined by
Lemma 2.2. By the structure of $E$, it is clear that

$$\deg(p_D(c)) \leq q^4 - q^3. \tag{11}$$

Next, consider the polynomial

$$p(c) = \prod_{D \subseteq [n],\ |D| = n-k} p_D(c). \tag{12}$$

Clearly $p(c)$ is not identically zero, and its degree is upper
bounded by $\binom{n}{k} q^3 (q-1)$. Hence it is sufficient that we find
a non-zero assignment $c_0 \neq 0$ for $c$ such that $p(c_0) \neq 0$.
By the Combinatorial Nullstellensatz [13], this is possible if we
choose the field size greater than $\binom{n}{k} q^3 (q-1) + 1$. Thus we
have proved the following theorem.

Theorem 2.3: There exists an assignment for the coefficients
in the parity-check constraints in (5), (6) such that the
code described in Sec. II-A is an MSR code.
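To give a feel for the field size that this existence argument asks for, the short sketch below evaluates the quantity $\binom{n}{k} q^3 (q-1) + 1$ with $n = 3q$, $k = 2q$ for a few small values of $q$. The function name `field_size_threshold` is hypothetical; the computation only restates the bound above and says nothing about whether smaller fields might suffice.

```python
from math import comb


def field_size_threshold(q):
    """Value binom(n, k) * q^3 * (q - 1) + 1 for n = 3q, k = 2q."""
    n, k = 3 * q, 2 * q
    return comb(n, k) * q ** 3 * (q - 1) + 1


for q in (2, 3, 4):
    print(q, field_size_threshold(q))
```

The threshold grows quickly with $q$ because of the binomial factor, which is consistent with the statement that the coefficients exist in a sufficiently large field.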


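$$J = \begin{bmatrix}
I_{q^3} & \cdots & I_{q^3} \\
I_{q^3} & \cdots & I_{q^3} \\
\vdots & & \vdots \\
I_{q^3} & \cdots & I_{q^3}
\end{bmatrix}
\begin{matrix} \Delta = 0 \\ \Delta = 1 \\ \vdots \\ \Delta = q-1 \end{matrix} \tag{8}$$

$$E = \begin{bmatrix}
0_{q^3} & \cdots & 0_{q^3} & 0_{q^3} & \cdots & 0_{q^3} & 0_{q^3} & \cdots & 0_{q^3} \\
E^{(1)}_{1,0} & \cdots & E^{(1)}_{1,q-1} & E^{(2)}_{1,0} & \cdots & E^{(2)}_{1,q-1} & E^{(3)}_{1,0} & \cdots & E^{(3)}_{1,q-1} \\
\vdots & & \vdots & \vdots & & \vdots & \vdots & & \vdots \\
E^{(1)}_{q-1,0} & \cdots & E^{(1)}_{q-1,q-1} & E^{(2)}_{q-1,0} & \cdots & E^{(2)}_{q-1,q-1} & E^{(3)}_{q-1,0} & \cdots & E^{(3)}_{q-1,q-1}
\end{bmatrix} \tag{9}$$
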

Using the constant $c_0$ guaranteed in the proof of Thm. 2.3
and $J_0$ in (10), the parity-check matrix $H$ of the MSR code
takes the form

$$H = J_0 + c_0 E. \tag{13}$$

B. Code Construction for $R = \frac{t-1}{t}$, $t \geq 2$

The principle of the construction is elucidated completely in the
last section, and the generalization to the case of rate
$R = \frac{t-1}{t}$, $t \geq 2$ is straightforward. For an auxiliary parameter
$q = p^m$ for some prime $p$ and $m$ a positive integer, the code
construction has parameters

$$n = tq, \quad k = (t-1)q, \quad d = n - 1, \quad \alpha = q^t.$$

A 2-tuple $(i, \theta)$, $i \in \{1, 2, \ldots, t\}$, $\theta \in \mathbb{F}_q$ is used to index the
columns. The rows are indexed by elements $(x_1, x_2, \ldots, x_t)$
from $\mathbb{F}_q^t$ where $x_j \in \mathbb{F}_q$. Thus $C(x_1, x_2, \ldots, x_t; (i, \theta))$
represents one code symbol from the codeword array at the
intersection of the row $(x_1, x_2, \ldots, x_t)$ and the node $(i, \theta)$.
The code is described by parity-check constraints. For
every $\mathbf{x} = (x_1, x_2, \ldots, x_t) \in \mathbb{F}_q^t$,

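$$\sum_{\theta \in \mathbb{F}_q} C(\mathbf{x}; (1, \theta)) \oplus \sum_{\theta \in \mathbb{F}_q} C(\mathbf{x}; (2, \theta)) \oplus \cdots \oplus \sum_{\theta \in \mathbb{F}_q} C(\mathbf{x}; (t, \theta)) = 0, \tag{14}$$

$$C(x_1 - \Delta, x_2, \ldots, x_t; (1, x_1)) \oplus C(x_1, x_2 - \Delta, \ldots, x_t; (2, x_2)) \oplus \cdots \oplus C(x_1, x_2, \ldots, x_t - \Delta; (t, x_t)) \oplus \sum_{\theta \in \mathbb{F}_q} C(\mathbf{x}; (1, \theta)) \oplus \sum_{\theta \in \mathbb{F}_q} C(\mathbf{x}; (2, \theta)) \oplus \cdots \oplus \sum_{\theta \in \mathbb{F}_q} C(\mathbf{x}; (t, \theta)) = 0, \quad \Delta \in \mathbb{F}_q^*. \tag{15}$$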

The parity-check constraint in (14) is referred to as the
row-parity, and that in (15) is referred to as the $\Delta$-parity. As
in the special case of $t = 3$ described in Sec. II-A, the first $t$
terms in the $\Delta$-parity equations are entries that do not belong
to the $(x_1, x_2, \ldots, x_t)$-row. These entries are referred to as
the shifted entries. Existence of coefficients for the parity-check
equations that ensure the MDS property follows along the same
lines as in Sec. II-A2. What remains is to present a repair strategy
that is bandwidth-optimal.

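1) Optimal Repair of a Failed Node: Assume that the
node $(i_0, \theta_0)$ failed. We download symbols belonging to the rows
$\Gamma = \{\mathbf{x} \mid x_j \in \mathbb{F}_q, \forall j \neq i_0,\ x_{i_0} = \theta_0\}$. Clearly $|\Gamma| = q^{t-1}$.
Thus we have $\{C(\mathbf{x}; (i, \theta)) \mid (i, \theta) \neq (i_0, \theta_0),\ \mathbf{x} \in \Gamma\}$. All the
code symbols

$$C(\mathbf{x}; (i_0, \theta_0)), \quad \mathbf{x} \in \Gamma$$

are repaired using the row-parities. Then we have all the
symbols belonging to rows in $\Gamma$ from all the $n$ nodes. Next,
let us write the equation for the $\Delta$-parity, $\Delta \in \mathbb{F}_q^*$, corresponding
to an arbitrary row $\mathbf{x} \in \Gamma$:

$$C(x_1 - \Delta, x_2, \ldots, x_t; (1, x_1)) \oplus \cdots \oplus C(x_1, \ldots, x_{i_0} - \Delta, \ldots, x_t; (i_0, \theta_0)) \oplus \cdots \oplus C(x_1, x_2, \ldots, x_t - \Delta; (t, x_t)) \oplus \sum_{\theta \in \mathbb{F}_q,\ j \in [t]} C(\mathbf{x}; (j, \theta)) = 0. \tag{16}$$

Except the term $C(x_1, \ldots, x_{i_0} - \Delta, \ldots, x_t; (i_0, \theta_0))$, all other
symbols involved in (16) are known to us. By varying $\Delta$, we
can thus repair all the remaining symbols in the node $(i_0, \theta_0)$.
The total number of symbols downloaded per node is

$$\beta = q^{t-1} = \frac{\alpha}{d - k + 1},$$

and thus the repair is bandwidth-optimal.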
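The parameter relations of the general construction can be tabulated with a few lines of code. The sketch below computes $(n, k, d, \alpha, \beta)$ for given $t$ and $q$ and checks three identities stated in the text: the rate equals $\frac{t-1}{t}$, the per-node download satisfies $\alpha = (d - k + 1)\beta$, and $\alpha = \left(\frac{k}{t-1}\right)^t$. The function names are hypothetical, and the sketch does not verify that $q$ is a prime power.

```python
from fractions import Fraction


def code_parameters(t, q):
    """Parameters of the rate-(t-1)/t construction with auxiliary parameter q."""
    n, k = t * q, (t - 1) * q
    return {"n": n, "k": k, "d": n - 1, "alpha": q ** t, "beta": q ** (t - 1)}


def check_relations(t, q):
    """Verify the rate, the MSR repair relation and the sub-packetization formula."""
    p = code_parameters(t, q)
    assert Fraction(p["k"], p["n"]) == Fraction(t - 1, t)
    assert p["alpha"] == (p["d"] - p["k"] + 1) * p["beta"]
    assert Fraction(p["k"], t - 1) ** t == p["alpha"]
    return p


for t, q in [(2, 2), (3, 2), (3, 3), (4, 3)]:
    print(t, q, check_relations(t, q))
```

For instance, $t = 4$ and $q = 3$ give $n = 12$, $k = 9$, $d = 11$, $\alpha = 81$ and $\beta = 27$, so a repair downloads $d\beta = 297$ symbols in total.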